A Framework for Bus Trajectory Extraction and Missing Data Recovery for Data Sampled from the Internet
Abstract
This paper presents a novel framework for trajectory extraction and missing data recovery for bus travel data sampled from the Internet. The trajectory extraction procedure consists of three main parts: trajectory clustering, trajectory cleaning, and trajectory connecting. In the clustering procedure, we focus on feature construction and parameter selection for the fuzzy C-means clustering method. Following clustering, the trajectory cleaning algorithm is built on a newly introduced fuzzy connecting matrix, which evaluates the possibility that data points belong to the same trajectory and helps detect anomalies in a ranked, context-related order. Finally, the trajectory connecting algorithm addresses cases in which a single route trajectory is incorrectly partitioned into several clusters. In the missing data recovery procedure, we develop contextual linear interpolation for missing data occurring inside the trajectory and median value interpolation for missing data outside the trajectory. Extensive experiments demonstrate that the proposed framework can extract and recover bus trajectories sampled from the Internet.
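The two recovery cases described in the abstract can be illustrated with a minimal sketch. The abstract does not define the "contextual" part of the interpolation or what the median is taken over, so the code below makes two assumptions that are not from the paper: interior gaps are filled by plain linear interpolation between the nearest known samples, and gaps before the first or after the last known sample are filled with the median of the known samples. The function name `recover_missing` and the uniformly sampled input series are hypothetical.

```python
from statistics import median
from typing import List, Optional


def recover_missing(samples: List[Optional[float]]) -> List[float]:
    """Fill None entries in a uniformly sampled series.

    Assumptions (not from the paper): gaps between two known samples are
    filled by plain linear interpolation; gaps before the first or after
    the last known sample are filled with the median of the known samples.
    """
    known = [i for i, v in enumerate(samples) if v is not None]
    if not known:
        raise ValueError("at least one known sample is required")

    fallback = median(samples[i] for i in known)
    first, last = known[0], known[-1]
    filled = list(samples)

    # Outside the observed span: median value fill.
    for i in range(len(samples)):
        if i < first or i > last:
            filled[i] = fallback

    # Inside the observed span: interpolate between neighbouring known samples.
    for left, right in zip(known, known[1:]):
        step = (samples[right] - samples[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = samples[left] + step * (i - left)

    return filled


if __name__ == "__main__":
    # Hypothetical series with one leading, two interior, and one trailing gap.
    print(recover_missing([None, 10.0, None, None, 16.0, 18.0, None]))
```

Treating the two cases separately reflects the fact that an interior gap has known values on both sides to interpolate between, while a gap at either end has only one known neighbour, so a robust summary statistic is used instead.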
Related articles
A statistical analysis framework for bus reliability evaluation based on AVL data: A case study of Qazvin, Iran
Reliability is a fundamental factor in the operation of bus transportation systems because it is a direct indicator of the quality of service and the operator's costs. Today, the application of GPS technology in bus systems makes large volumes of data available, though it brings the difficulty of preprocessing the data in a methodical way. In this study, the principal component an...
Investigating the missing data effect on credit scoring rule based models: The case of an Iranian bank
Credit risk management is a process in which banks estimate the probability of default (PD) for each loan applicant. Data sets of previous loan applicants are built by gathering their data; these internal data sets are usually completed using external credit bureau data and finally used for estimating PD in banks. There is also a continuing interest among banks in using rule based classifiers to b...
Assessment of the Performance of Clustering Algorithms in the Extraction of Similar Trajectories
In recent years, the rapid growth of spatial trajectory data and the need to process it and extract useful information and meaningful patterns have drawn many researchers to the field of spatio-temporal trajectory clustering. The processing and analysis of these trajectories have resulted in the extraction of useful information whic...
Lane Change Trajectory Model Considering the Driver Effects Based on MANFIS
The lane change maneuver is among the most common driving behaviors. It is also the basic element of important maneuvers such as overtaking. Therefore, it is chosen as the focus of this study, and novel multi-input multi-output adaptive neuro-fuzzy inference system (MANFIS) models are proposed for this behavior. These models are able to simulate and predict the future behavior of a Dri...
Application of the Response Surface Methodology for the Optimization of the Aqueous Enzymatic Extraction of Pistacia Khinjuk Oil
ABSTRACT: Aqueous enzymatic extraction of oil from Pistacia khinjuk was performed using cellulase. A central composite design was used to optimize the parameters that are significant to the process. The influence of three regressors on the percentage of oil recovered from the seed was evaluated using a second-order polynomial multiple regression model. Analysis of variance showed a high coefficient ...